Abstract

The Caspar-Klug classification of viruses whose protein shell, called viral capsid,
exhibits icosahedral symmetry, has recently been extended to incorporate
viruses whose capsid proteins are exclusively organised in pentamers. The approach,
named 'Viral Tiling Theory', is inspired by the theory of quasicrystals,
where aperiodic Penrose tilings enjoy 5-fold and 10-fold local symmetries. This
paper analyzes the extent to which this classification approach informs dynamical
properties of the viral capsids, in particular the pattern of Raman active modes of
vibrations, which can be observed experimentally.

1 Introduction

Viruses are dynamical particles which can undergo vibrational movements. The study
of normal modes of vibration of viral capsids may shed light on conformational changes
of viruses, help understand the release of genetic material in the viral replication
process, and reveal how chemically bound compounds affect the flexibility of the capsid
and hence inhibit the infectiousness of the virus. To analyze vibrational patterns in viral
capsids, one requires knowledge of the capsid's structure and its energy function. In most
cases, a classical mechanics treatment for viruses kept at a temperature close to 0 K,
governed by a harmonic potential, yields a reasonable picture of any viral motion, which
can be exactly expressed as a superposition of normal modes. Among those, the most
relevant in applications are the lowest frequency modes, which correspond to delocalized
motions of a large number of atoms.

Several detailed analyses of normal modes of vibrations have appeared recently in the
literature [1-4], where the capsid's structure is considered at various stages of
coarse-graining, and where the potential mimics the effect of springs in-between the capsid's
constituents in a given coarse-grained approximation [5]. These calculations are
computationally demanding, as more accurate descriptions of the capsid's structure
involve a rapidly increasing number of degrees of freedom. Group theoretical methods have
long been an important ally in vibrational analyses [6], particularly in the calculation of
the normal modes of vibration of fullerenes [7, 8, 9]. They improve computational performance and
have recently been used in studies of vibration patterns for small viruses at the atomic
level, with a basis set including all the internal dihedral angles of the system considered,
except for peptide bonds which are assumed to be rigid [4]. These studies however are
very much done case by case, and do not capture potential patterns of vibrations within
certain classes of viruses.

The present note investigates whether Viral Tiling Theory, a recently proposed model for 
icosahedral viral capsids which solves a classification puzzle in the Caspar-Klug
nomenclature [10], provides a new qualitative insight into the dynamics of viruses. In particular,
we ask whether there is a correlation between the vibrational patterns of viruses with a 
given number of coat proteins and their viral tiling. The following analysis is mainly qual- 
itative and informs on the group theoretical properties of the normal modes of vibration. 
These properties make it possible to identify which normal modes can be detected by Raman and
infrared spectroscopy. The calculation of the relevant frequencies of vibration requires 
further techniques which we postpone to a future publication [11]. 

The paper is organised as follows. In Section 2, we present the tilings relevant to the 
description of a variety of viruses with triangulation numbers T = 3 and T = 7, together 
with their maximally symmetric decorations. We then confront these ideally decorated 
tiles with experimental data for Bacteriophages MS2 and HK97, as well as for Tomato
Bushy Stunt Virus and Simian Virus 40 in Section 3. We argue that MS2 and SV40 
capsids exhibit a centre of inversion in good approximation. In Section 4, we use the 
group theory underlying the icosahedral symmetry of the viral capsids considered to 
obtain qualitative information on their normal modes of vibration, and we offer some 
conclusions in the last section. 



2 Ideal tilings of icosahedral viral capsids 

The mathematical structure underpinning the symmetry of icosahedral capsids is the
non-crystallographic Coxeter group $H_3$ (also labelled $I_h$ in the science literature) [13]. It
contains 120 elements and is generated by a 2-fold rotation $g_2$ and a 3-fold rotation $g_3$,
chosen according to the rules in Appendix A, as well as by the inversion transformation
$g_0$ which maps each 3-dimensional point of coordinates $(x, y, z)$ to $-(x, y, z)$. The sixty
proper rotations generated by $g_2$ and $g_3$ form the subgroup $\mathcal{I}$, which is the relevant
group for the study of vibrational patterns whenever the viral capsid does not possess a
centre of inversion. By this we mean that the distribution of the $N$ capsid constituents
$C_i$, $i = 1, .., N$, each being represented by a vector whose origin coincides with the centre
of the capsid and whose components are $(x_i, y_i, z_i)$, is not invariant under the inversion
operation which maps $(x_i, y_i, z_i)$ to $-(x_i, y_i, z_i)$. Although experimental data do not
support the existence of capsids with a strict centre of inversion, we argue in Section 3
that some have it to a very good approximation, and therefore their normal modes of
vibration can be analysed with the help of the full icosahedral group rather than its
subgroup $\mathcal{I}$.

The affinization of $H_3$ has been recently constructed in [14] and can be used to determine
the locations of all global and local symmetry axes of viral capsids compatible with
icosahedral symmetry [15]. Once these axes have been identified from group theoretical
considerations for a viral capsid of given triangulation number T, it is easy to design
spherical tiles with appropriate decorations^1 which pave the capsid and keep track of
the cluster distribution of proteins around the symmetry axes of the capsid, while also
encoding the bond structure between those proteins.

There exist several ways to decorate the prototiles of a given tiling with dots representing 
individual capsid proteins, and still capture their symmetry properties under the global 
5-, 3- and 2-fold rotations of the icosahedron. However, some decorating patterns are 
distinguished inasmuch as they correspond to capsid protein distributions with maximal 
symmetry. By this we mean that the distribution of dots on each prototile of such ideal 
tilings exhibits the highest symmetry compatible with the tile shape. In what follows, 
we describe ideal tilings for two types of T = 3 and T = 7 viral capsids, as they will be 
used as references when considering the experimental data for viruses representative of
these triangulation numbers. 

Our choice of T = 3 and T = 7 capsids is motivated by our wish to gain information on 
whether or not qualitative differences in vibrational patterns are rooted in the nature of 
the tiling considered (at fixed capsid size) and are independent of the capsid's size. 

2.1 T = 3 tilings 

The T = 3 icosahedral capsids come in three mathematical species, of which two can be 
used to model viruses observed in vivo. The corresponding tessellations are the Caspar- 
Klug (CK) tiling, with triangular prototiles encoding trimer interactions, the rhomb 
tiling with prototiles in the shape of rhombs representing dimer interactions, and a kite 
tiling whose building blocks encode trimer interactions as in the CK case. We shall not 
discuss this latter case any further as we are not aware of any T = 3 virus which could 
be modelled in this way. 

The capsid proteins are organised in pentamers (clusters of five) about the 12 global 
5-fold symmetry axes, and hexamers (clusters of six) about 20 local 6-fold symmetry 
axes, which coincide with the 20 global 3-fold symmetry axes of the icosahedron. The 
total number of proteins is therefore 180 = 12 x 5 + 20 x 6. The ideal tilings shown 
on a planar representation of the icosahedron in Fig. 1 and Fig. 2 take into account the
fact that T = 3 capsid proteins belong to three different chains^2 - symbolised by three
different colours - but their location on the prototiles does not represent certain aspects
of the experimental data, such as the exact locations of the centres of mass of the capsid
proteins, as will be discussed in the following section.

^1 Decorations are dots on the tiles indicating schematically the locations of the capsid proteins.




Figure 1: The ideal Caspar-Klug tiling for the T = 3 viral capsid with the location of
proteins in chains A, B and C marked with purple, cyan and white dots respectively. The
grey shaded triangular prototiles highlight trimer interactions between capsid proteins,
while the blue shaded region corresponds to the fundamental domain of the proper rotation
subgroup of the full icosahedral group $H_3$.

It is interesting to note that the CK T = 3 ideal tiling does not exhibit a centre of
inversion when considering the action of the icosahedral group on individual proteins, as
is most obvious in the structure of the fundamental domain of the subgroup $\mathcal{I}$ shaded
in blue on the figures: a centre of inversion does exist if all dots in the kite-shaped
fundamental domain are invariant under a reflection through the unique axis of symmetry
of the kite. This fails to happen in the CK tiling if the white and purple dots represent
proteins of different types and masses, which is usually the case. For example, the
masses of the three proteins of the Tomato Bushy Stunt Virus are (in atomic mass units)
28149.5, 28067.4 and 31311.5 respectively. There is hence a non-negligible difference
of 3244.1 atomic mass units between the two proteins constituting the hexamer. The
situation is different however for the rhomb tiling of Fig. 2, where a centre of inversion
of the ideal tiling is present, and the fundamental domain in this case is reduced to half
the kite of Fig. 1 cut through the symmetry axis.



2.2 T = 7 tilings 

We now construct two ideal tilings for T = 7 capsids. The first one has left chirality
(T = 7l) and accommodates 420 capsid proteins, with four types of dimer interactions
modelled by rhomb prototiles as shown in Fig. 3. Note that the fundamental domain
shaded in blue indicates this tiling does not have a centre of inversion, since the
distribution of dots within it is not symmetrical under a reflection through the symmetry axis
of the kite-shaped domain.



^2 Polypeptide chains representing capsid proteins.



Figure 2: The ideal rhomb tiling for the T = 3 viral capsid with the location of proteins
in chains A, B and C marked with purple, cyan and white dots respectively. The grey
shaded rhombic prototiles highlight dimer interactions between capsid proteins, while the
blue shaded region corresponds to the fundamental domain of the full icosahedral group
$H_3$.

The second capsid is of right chirality (T = 7d) but only accommodates 360 capsid
proteins organised in pentamers through two types of prototiles, namely rhombs encoding
two types of dimer interactions, and kites, encoding trimer interactions. Fig. 4 clearly
shows that this ideal capsid does not have a centre of inversion. We now confront our
ideal tilings with experimental data for T = 3 and T = 7 viruses and draw conclusions
on actual capsid symmetries.



3 Experimental data and viral capsid symmetries 

In order to extract qualitative features of vibrational patterns from viral capsids, we 
first restrict ourselves to a coarse-grained approximation where each capsid protein is 
replaced by a point mass whose location coincides with the centre of mass of the protein 
considered. This centre of mass is calculated by taking into consideration all crystallo- 
graphically identified atoms of the protein, according to data stored in the Protein Data 
Bank or equivalently the VIPER website. 
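As an illustration of this coarse-graining step, the sketch below computes a mass-weighted centre of mass from a list of atoms. The tuple layout `(element, x, y, z)`, the function name `centre_of_mass` and the small table of approximate atomic masses are assumptions made for the example; reading an actual .pdb or .vdb file is not shown.

```python
# Approximate atomic masses in atomic mass units (illustrative subset only).
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}


def centre_of_mass(atoms):
    """Return the mass-weighted mean position of atoms given as (element, x, y, z)."""
    total_mass = 0.0
    weighted = [0.0, 0.0, 0.0]
    for element, x, y, z in atoms:
        mass = ATOMIC_MASS[element]
        total_mass += mass
        for axis, coordinate in enumerate((x, y, z)):
            weighted[axis] += mass * coordinate
    return tuple(w / total_mass for w in weighted)


# Hypothetical two-atom "protein": the heavier oxygen pulls the centre towards itself.
print(centre_of_mass([("C", 0.0, 0.0, 0.0), ("O", 1.0, 0.0, 0.0)]))
```

Each capsid protein would then be replaced by a single point located at the returned position.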

We then quantify the deviation between the capsid protein distribution obtained in this 
approximation and the distribution obtained from it by inversion. In the light of this
information, we finally discuss, for each virus considered, whether the capsid exhibits a 
centre of inversion in good approximation. 
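One possible way to quantify this deviation is sketched below, under the assumption that the deviation of a protein is half the distance between its centre of mass and the nearest inverted centre of mass of the same chain. The function name `inversion_deviations` and the input layout (a list of 3-tuples for one chain) are hypothetical.

```python
import math


def inversion_deviations(centres):
    """Half the distance from each point to the nearest inverted point of the same chain."""
    # Inversion through the origin maps (x, y, z) to (-x, -y, -z).
    inverted = [tuple(-c for c in point) for point in centres]
    return [min(math.dist(point, image) for image in inverted) / 2 for point in centres]


# Toy chain with two nearly antipodal points (coordinates in arbitrary units).
print(inversion_deviations([(1.0, 0.0, 0.0), (-1.0, 0.2, 0.0)]))
```

A chain whose distribution is exactly invariant under inversion gives zero for every point.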



3.1 Bacteriophage MS2 and Tomato Bushy Stunt Virus 

We start by considering the case of MS2, a T = 3 capsid with 180 capsid proteins
organised in a first approximation according to a rhomb tiling. Based on the .pdb (or
.vdb) file with ID 2ms2^3, we represent in Fig. 5 the centre of mass of the A, B and C
chains in purple, cyan and white respectively.



^3 See http://viperdb.scripps.edu/index.php or http://www.rcsb.org/pdb/home/home.do




Figure 3: The ideal rhomb tiling for the T = 7l viral capsid with the location of proteins
in chains A, B, C, D, E, F and G marked with purple, cyan, white, green, pink, magenta
and yellow dots respectively. The blue shaded region corresponds to the fundamental
domain of the proper rotation subgroup of the full icosahedral group $H_3$.

Figure 4: The ideal rhomb and kite tiling for the T = 7d all-pentamer capsid with the
location of proteins in chains A, B, C, D, E, and F marked with purple, cyan, white,
green, pink and magenta dots respectively. The blue shaded region corresponds to the
fundamental domain of the proper rotation subgroup of the full icosahedral group $H_3$.



We use this information to derive the qualitative locations of these vertices with respect
to a T = 3 rhomb tiling. The result is shown for an icosahedral triangle in Fig. 5(c).
In Fig. 5(a) and (b), we show in black some of the vertices obtained by inversion from
vertices located on the far side of the capsid. The distances between these black vertices
and the vertices to which they are connected by a black line correspond to twice the
deviation from inversion symmetry for these vertices. The deviation from inversion
symmetry can be determined for each chain, and we obtain the figures given in Table 1.
Although Bacteriophage MS2 is determined at a resolution of 2.8 Å, we argue that the
small deviations in Table 1 are consistent with the assumption that the viral capsid has
an approximate centre of inversion.

Figure 5: The calculated centres of mass for the capsid proteins in chains A, B and C
of Bacteriophage MS2 are represented by big dots in purple, cyan and white respectively,
while the small black dots correspond to the distribution of centres of mass after inversion.
Views from above a 5-fold (a) and a 3-fold (b) symmetry axis are provided; (c) Rhomb
tiling decorations induced by experimental data.

We thus determine symmetry-corrected versions of the tiling by averaging the position
of each protein with that of an inverted one (of the same chain) in its immediate
neighbourhood.
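The averaging step can be sketched as follows, assuming that the partner of each protein is the nearest inverted centre of mass of the same chain. The function name `symmetry_corrected` is hypothetical, and the result is exactly inversion invariant only when the nearest-partner pairing is mutual, as it is in the toy input used here.

```python
import math


def symmetry_corrected(centres):
    """Replace each point by the midpoint between it and its nearest inverted partner."""
    inverted = [tuple(-c for c in point) for point in centres]
    corrected = []
    for point in centres:
        # Nearest inverted point of the same chain.
        partner = min(inverted, key=lambda image: math.dist(point, image))
        corrected.append(tuple((a + b) / 2 for a, b in zip(point, partner)))
    return corrected


# Toy chain with two nearly antipodal points (coordinates in arbitrary units).
print(symmetry_corrected([(1.0, 0.0, 0.0), (-1.0, 0.2, 0.0)]))
```

Choosing the next-to-nearest partner instead of the nearest one would only change the `min` selection rule.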

chain   colour   deviation from inversion symmetry (in Å)
A       purple   5.77
B       cyan     5.48
C       white    5.00

Table 1: The deviation from inversion symmetry (in Ångström) for MS2.

For MS2, there are two inverted candidates in the neighbourhood of a C chain protein,
one corresponding to the nearest, and the other to the next-to-nearest neighbour. The
first corresponds to the deviations quoted in Table 1, while the second corresponds to a
deviation of 5.92 Å for the C chain. We choose to consider this second option as well
because it is not much larger than the maximal deviation for this virus, which is 5.77 Å
(see the entry for the A chain in Table 1). Fig. 6 and Fig. 7 illustrate capsids with
the two types of inversion symmetry corrections, and the corresponding rhombic tilings
which will be used in Section 4 to calculate vibrational patterns. Note that the tiling in 
Fig. ItFc) coincides with the ideal tiling in Fig. [2] 

The case of the T = 3 CK-type Tomato Bushy Stunt Virus is straightforward: the analysis
of data, based on the file 2TBV.pdb, shows that the capsid is considerably further from
having a centre of inversion, as can be deduced from the deviations presented in Table 2.
These were calculated by averaging the position of each protein with that of the inverted
one (of the same type) closest to it. The resolution at which TBSV is determined is
2.9 Å. Table 2 shows that the deviations for TBSV are much larger than those for MS2:
whilst the deviation is below 6 Å for MS2, it is about 10 Å for TBSV. Recall from Section
2 that the T = 3 CK ideal tiling of Fig. 1 usually does not exhibit a centre of inversion
either, because the proteins constituting the hexamers differ.

Figure 6: Symmetry-corrected MS2 capsids of Type 1 (a) along a 5-fold axis, (b) along a
3-fold axis, (c) corresponding tiling decorations.

Figure 7: Symmetry-corrected MS2 capsids of Type 2 (a) along a 5-fold axis, (b) along a
3-fold axis, (c) corresponding tiling decorations.

We also provide in Fig. 8 the tiling induced from the experimental data, where the dots
represent the centres of mass of individual proteins. 



chain   colour   deviation from inversion symmetry (in Å)
A       purple   9.66
B       cyan     10.03
C       white    8.86

Table 2: The deviation from inversion symmetry (in Ångström) for TBSV.



3.2 Simian Virus 40 and Bacteriophage HK97 

We now turn our attention to SV40, a T = 7d capsid with 360 capsid proteins organised
in first approximation according to the rhomb and kite tiling presented in Fig. 4. This is
an all-pentamer capsid consisting of six chains. Based on the 1SVA.vdb file, we represent
in Fig. 9 the centres of mass of the A, B, C, D, E and F chains in purple, cyan, white,
green, pink and magenta respectively.

Figure 8: The triangular tiling and its decorations for TBSV in the light of experimental
data.








Figure 9: The calculated centres of mass for the capsid proteins in chains A, B, C, D, 
E and F of Simian Virus 40 are represented by big dots in purple, cyan, white, green, 
pink and magenta respectively, while the small black dots correspond to the distribution of 
centres of mass after inversion. Views from above a 5-fold (a) and a 3-fold (b) symmetry 
axes are provided; (c) Rhomb and kite tiling shown in comparison with experimental 
data. Red dots represent axes of five-fold and black dots axes of three-fold icosahedral 
symmetry. 



We use this information to derive the qualitative locations of these vertices with respect
to a T = 7d rhomb and kite tiling. The result is shown in Fig. 9(c). In Fig. 9(a) and
(b), we show in black some of the vertices obtained by inversion from vertices located on
the far side of the capsid. The distances between these black vertices and the vertices to
which they are connected by a black line correspond to twice the deviation from inversion
symmetry for these vertices.

The deviation from inversion symmetry can be determined for each chain, and we obtain
the figures given in Table 3. Although the resolution at which SV40 is determined is 3.1
Å, we argue that the small deviations in Table 3 are consistent with the assumption that
the viral capsid has an approximate centre of inversion.



chain   colour    deviation from inversion symmetry (in Å)
A       purple    7.07
B       cyan      4.99
C       white     3.06
D       green     0.70
E       pink      3.25
F       magenta   7.64

Table 3: The deviation from inversion symmetry (in Ångström) for SV40.



As in the case of Bacteriophage MS2, we calculate the symmetry-corrected version of
the tiling by averaging the position of each protein with that of the inverted one (of the
same chain) closest to it. The results are presented in Fig. 10.

Figure 10: The symmetry-corrected versions for SV40: 3D representation of the location
of protein centres of mass in a five-fold (a) and three-fold (b) view; (c) the corresponding
rhomb and kite tiling and its decorations.

We end with Bacteriophage HK97, whose T = 7l capsid has 420 proteins belonging to
7 different chains, organised in pentamers around the global 5-fold symmetry axes of
the underlying icosahedron and hexamers everywhere else. Using the data from the file
2fte.vdb, we have again calculated the centres of mass of all capsid proteins involved and
compared their distribution with the distribution obtained by inversion. Table 4 lists
the deviations of each chain from a distribution with inversion symmetry, and, given the
relatively large numbers, one cannot reasonably argue that the HK97 capsid has a centre
of inversion, even approximately.



chain   colour    deviation from inversion symmetry (in Å)
A       purple    16.64
B       cyan      2.33
C       white     10.92
D       green     17.84
E       pink      13.3
F       magenta   0.61
G       blue      0.61

Table 4: The deviation from inversion symmetry (in Ångström) for HK97.



4 Group theoretical properties of Raman modes

Our aim is to provide qualitative information on the vibrational modes of the viral
capsids described in the previous section. As explained there, we work within a
coarse-grained approximation which consists in substituting each capsid protein by a point mass.
This point mass is located either at the centre of mass of the given protein, or at the
midpoint between its centre of mass and that of an inverted protein of the same chain
in its immediate neighbourhood. The former approximation is chosen when the capsid
protein distribution deviates significantly from a distribution invariant under inversion
(TBSV and HK97), while the latter is adopted for MS2 and SV40 as their capsid protein
distributions are inversion invariant to good approximation. This particular
coarse-graining reduces considerably the number of degrees of freedom which must be treated
in the vibration analysis. Given a virus with $N$ capsid proteins, the system depends on
$3N$ degrees of freedom. Strictly speaking, our capsids are distributions of point masses
calculated according to one of the two prescriptions above, but we will, in the remainder
of this paper, speak of distributions of capsid proteins although the word 'protein' here
must be taken in a loose sense.

Precise numerical values for the frequencies of normal modes of vibrations can only be
obtained by diagonalisation of the force matrix, which encodes the interactions between
the point particles of our approximation. Group theory provides an elegant route to the
diagonalisation [11], and together with Viral Tiling ideas, offers valuable insights into the
description of patterns of normal modes, as we proceed to show.

We pay particular attention to the normal modes of vibration which can be detected via
Raman spectroscopy [12], as this technique is, unlike infrared spectroscopy, well-suited
for measuring frequency spectra in aqueous samples such as living organisms.

The following group theoretical considerations are standard and can be found, for
instance, in [16].



4.1 Displacement representation of $H_3$ or $\mathcal{I}$ for generic capsids

The first step in this analysis involves the decomposition of the displacement
representation of the capsid into a direct sum of irreducible representations of the icosahedral
group $H_3$ or its proper rotation subgroup $\mathcal{I}$.

The former is used whenever the capsid has an approximate centre of inversion, and the
latter when it does not.

The displacement representations of the icosahedral group $H_3$ (resp. of its subgroup
$\mathcal{I}$) for viruses or phages with $N$ capsid proteins consist of 120 (resp. 60) matrices
$\Gamma^{\text{displ}}_{3N}(g)$, $g \in H_3$ (resp. $\mathcal{I}$), of size $3N \times 3N$, which encode how proteins are interchanged
under the action of each element $g$, as well as how the displacements of each protein
from the equilibrium position are rotated under the action of $g$. The latter information
is gathered in $3 \times 3$ rotation matrices $R(g)$ which form an irreducible representation of
$H_3$ (resp. $\mathcal{I}$), while the former is encoded in permutation matrices $P(g)$ of size $N \times N$,
so that we have

$$ \Gamma^{\text{displ}}_{3N}(g) = P(g) \otimes R(g), \qquad g \in H_3 \ (\text{resp. } \mathcal{I}). \tag{4.1} $$

The permutation matrices $P(g)$ act on vectors whose components are the vector positions
$\vec{r}_i^{\,0}$, $i = 1, .., N$, of the $N$ proteins at equilibrium. The entry $P_{ij}(g)$ of the permutation
matrix is 1 if $\vec{r}_i^{\,0}$ is mapped on $\vec{r}_j^{\,0}$ by $g$, and is zero otherwise.
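The construction in (4.1) can be illustrated on a toy configuration. The sketch below is not a capsid calculation: it uses two points and a single 2-fold rotation, and the helper names `apply`, `permutation_matrix`, `kronecker` and `trace` are hypothetical. It builds $P(g)$ from the rule stated above and forms the tensor product with $R(g)$.

```python
def apply(matrix, vector):
    """Multiply a 3x3 matrix by a 3-vector."""
    return tuple(sum(m * v for m, v in zip(row, vector)) for row in matrix)


def permutation_matrix(points, rotation, tol=1e-9):
    """P[i][j] = 1 if the rotation maps point i onto point j, and 0 otherwise."""
    size = len(points)
    perm = [[0] * size for _ in range(size)]
    for i, point in enumerate(points):
        image = apply(rotation, point)
        for j, target in enumerate(points):
            if all(abs(a - b) < tol for a, b in zip(image, target)):
                perm[i][j] = 1
    return perm


def kronecker(a, b):
    """Kronecker (tensor) product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def trace(matrix):
    """Sum of the diagonal entries of a square matrix."""
    return sum(matrix[i][i] for i in range(len(matrix)))


# Toy configuration (not a capsid): two points exchanged by a 2-fold rotation about z.
points = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
rot_z_pi = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]

perm = permutation_matrix(points, rot_z_pi)
displacement = kronecker(perm, rot_z_pi)  # 6 x 6 matrix, i.e. 3N x 3N with N = 2
print(trace(displacement), trace(perm) * trace(rot_z_pi))
```

The trace of the tensor product equals the product of the traces, which is the property used to evaluate characters of the displacement representation.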

4.2 Decomposition of $\Gamma^{\text{displ}}_{3N}$ into irreducible representations

There exists a matrix $U$ - whose detailed form is not needed for our present analysis -
which transforms the displacement representation into a finite sum of irreducible
representations $\Gamma^p$ of $H_3$ (resp. $\mathcal{I}$):

$$ U\,\Gamma^{\text{displ}}_{3N}(g)\,U^{-1} = \Gamma'^{\,\text{displ}}_{3N}(g) = \bigoplus_p n_p\,\Gamma^p(g), \tag{4.2} $$

where the multiplicities $n_p$ are obtained via the following character formula

$$ n_p = \frac{1}{120}\sum_{g \in H_3} \chi^{\text{displ}}_{3N}(g)^*\,\chi^p(g) \quad\text{or}\quad n_p = \frac{1}{60}\sum_{g \in \mathcal{I}} \chi^{\text{displ}}_{3N}(g)^*\,\chi^p(g). \tag{4.3} $$

The characters $\chi^p(g)$ of irreducible representations of the icosahedral group are listed in
Table 5, while the characters of the displacement representations $\chi^{\text{displ}}_{3N}(g)$ are obtained
by inspection of the displacement representation considered. Note that, in view of the
very definition of the permutation matrices $P(g)$ given in the previous subsection, and
the fact that the characters of a representation are the traces of its constituent matrices,
one has

$$ \chi^{\text{displ}}_{3N}(g) = \mathrm{Tr}(P(g))\,\mathrm{Tr}(R(g)) = \pm\,(\text{number of proteins unmoved by } g)\cdot(1 + 2\cos\theta), \tag{4.4} $$

where $\theta$ is the angle of the proper rotation associated with $g$, and the minus sign is taken
when $g \in H_3 \setminus \mathcal{I}$. So $\chi^{\text{displ}}_{3N}(g)$ is zero when $\theta = 2\pi/3$ or whenever $g$ is such that no protein
of a given capsid is kept fixed under its action.
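The character formula (4.4) is simple enough to evaluate directly. The sketch below is a minimal illustration; the function name `displacement_character` and its `improper` flag (set for elements outside the proper rotation subgroup) are assumptions made for the example.

```python
import math


def displacement_character(n_unmoved, theta, improper=False):
    """Evaluate (4.4): +/- (number of unmoved proteins) * (1 + 2 cos(theta))."""
    sign = -1 if improper else 1
    return sign * n_unmoved * (1 + 2 * math.cos(theta))


# Identity acting on a capsid with 180 proteins: all proteins unmoved, theta = 0.
print(displacement_character(180, 0.0))
# A 2-fold rotation followed by inversion that leaves 12 proteins unmoved.
print(displacement_character(12, math.pi, improper=True))
```

For a 3-fold rotation the factor $1 + 2\cos\theta$ vanishes up to floating-point rounding, whatever the number of unmoved proteins.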

The decomposition of the displacement representation of a given capsid boils down to
the knowledge of the coefficients $n_p$ in (4.3) to which, in view of the expression (4.4), an
element $g$ contributes only if it leaves at least one capsid protein unmoved
(and $\theta \neq 2\pi/3$).

Before delving into the particulars of the four capsids we have chosen to analyse here, let
us enumerate a few properties of generic icosahedral capsids relevant to the calculation
of the decomposition coefficients $n_p$.

We set the origin of coordinates at the centre of the icosahedron which supports the
distribution of capsid proteins, so that all symmetry axes go through (0, 0, 0).



Property 1: Consider a virus shell with $N$ capsid proteins distributed according to
icosahedral symmetry. Then the identity element $e$ of the icosahedral group keeps all $N$
proteins trivially unmoved and $\mathrm{Tr}\,P(e) = N$. The corresponding rotation $R(e)$ has trace
+3 and $\chi^{\text{displ}}_{3N}(e) = 3N$.

Property 2: A capsid protein is unmoved by a proper 5-fold, 3-fold or 2-fold rotation
if and only if it is located on the 5-, 3- or 2-fold symmetry axis respectively.

Property 3: Consider a viral capsid exhibiting invariance under inversion, and identify
an arbitrary 2-fold axis of the capsid. Then any capsid protein lying in the plane
orthogonal to that 2-fold axis is unmoved by the element $g_0 g_2 = g_2 g_0$ which consists in a 2-fold
rotation about the chosen 2-fold axis followed by an inversion, or vice-versa.



Fig. 11 shows the 12 vertices of an icosahedron as the vertices of three golden rectangles
orthogonal to each other. A 2-fold axis is highlighted in red and a hexagonal portion
of the plane orthogonal to that axis is shaded in red. The perimeter of the hexagon is
the intersection of the icosahedron with the orthogonal plane, and proteins unmoved by
$g_0 g_2$ must be located on this perimeter.

Figure 11: (a) A hexagonal portion of the plane orthogonal to a given (red) 2-fold
symmetry axis is shaded in red in a representation of the icosahedron involving three
orthogonal golden rectangles, with the icosahedron superimposed in blue; (b) The perimeter
of the hexagonal portion projected on a planar template of the icosahedron. The red dots
represent the intersection of the 2-fold axis with the icosahedron.



4.3 Raman active modes of the four viral capsids 

Let us now look at the details of the decomposition into irreducible representations of 
the displacement representation of the four chosen capsids. 

MS2 type 1: this capsid has an approximate centre of inversion and therefore we use
the full icosahedral group $H_3$. According to the decorations in Fig. 6(c):

(i) No protein is located on a 5-, 3- or 2-fold symmetry axis, hence no proper rotation
bar the identity leaves any protein unmoved. The identity $e$ however keeps all 180
proteins trivially unmoved and $\chi(P(e)) = \mathrm{Tr}\,P(e) = 180$. Correspondingly, the
rotation $R(e)$ has trace +3 and $\chi^{\text{displ}}_{540}(e) = 540$.

(ii) All proteins on the plane orthogonal to a given 2-fold symmetry axis are distributed
according to Fig. 12(a). There are 12 proteins on that particular plane, hence
$\chi(P(g_0 g_2)) = \mathrm{Tr}\,P(g_0 g_2) = 12$ while $\chi(R(g_0 g_2)) = \mathrm{Tr}\,R(g_0 g_2) = -1 \times (1 + 2\cos\pi) = 1$,
so that $\chi^{\text{displ}}_{540}(g_0 g_2) = 12 \times 1 = 12$. There are 15 ways to choose a 2-fold axis on
the icosahedron, hence 15 elements of $H_3$ are of type $g_0 g_2$, which must be accounted
for when calculating the coefficients $n_p$ from (4.3).

Figure 12: (a) The 12 MS2 capsid proteins located on the plane orthogonal to a 2-fold
symmetry axis represented by the red dots, when considering the symmetry-corrected
version pictured in Fig. 6(c); (b) The 12 MS2 capsid proteins located on the plane
orthogonal to a 2-fold symmetry axis represented by the red dots, when considering the
symmetry-corrected version pictured in Fig. 7(c).

(iii) No other element of $H_3$ leaves any capsid protein of Bacteriophage MS2 type 1
unmoved, so that the coefficients are given by

$$ n_p^{\text{MS2 1}} = \frac{1}{120}\left\{540\,\chi^p(e) + 180\,\chi^p(g_0 g_2)\right\}, \qquad p = 1_\pm, 3_\pm, 3'_\pm, 4_\pm, 5_\pm. \tag{4.5} $$

Using the information in Table 5 one obtains the following decomposition of
the displacement representation into irreducible representations of $H_3$, labelled
$\Gamma^1_\pm, \Gamma^3_\pm, \Gamma^{3'}_\pm, \Gamma^4_\pm, \Gamma^5_\pm$:

$$ \Gamma^{\text{displ}}_{540,\text{MS2 1}} = 6\Gamma^1_+ + 12\Gamma^3_+ + 12\Gamma^{3'}_+ + 18\Gamma^4_+ + 24\Gamma^5_+ + 3\Gamma^1_- + 15\Gamma^3_- + 15\Gamma^{3'}_- + 18\Gamma^4_- + 21\Gamma^5_-. \tag{4.6} $$

The above describes all modes of vibration of the T = 3 icosahedral capsid
modelling Bacteriophage MS2 type 1, including rotations and translations of the capsid
as a whole. These six degrees of freedom belong to the $\Gamma^3_+$ and $\Gamma^3_-$ irreducible
representations of $H_3$ respectively, and must be subtracted from (4.6) in order to classify
the normal modes of vibration of the capsid. We thus have

$$ \Gamma^{\text{vib}}_{540,\text{MS2 1}} = 6\Gamma^1_+ + 11\Gamma^3_+ + 12\Gamma^{3'}_+ + 18\Gamma^4_+ + 24\Gamma^5_+ + 3\Gamma^1_- + 14\Gamma^3_- + 15\Gamma^{3'}_- + 18\Gamma^4_- + 21\Gamma^5_-. \tag{4.7} $$
It remains to identify, among the above modes, those which are potentially Raman 
active. Recall that Raman spectroscopy involves placing a molecule (here a virus), 
which vibrates at frequencies $\nu_{\text{internal}}$, in a time-varying electric field produced, 
for instance, by a monochromatic laser of frequency $\nu_{\text{laser}}$. One then looks for 
frequency shifts $\nu_{\text{laser}} \pm \nu_{\text{internal}}$ in the light which scatters inelastically from the 
molecule. Such shifts only occur if the internal motion of the molecule induces a 
change in its polarizability, which is a rank-2 symmetric tensor $\alpha_{ij}$, $i,j = 1,2,3$, 
entering the definition of the dipole moment $\mu_i$ of the molecule induced by the 
presence of an electric field $E_i$ ($\mu_i = \alpha_{ij}E_j$) [17]. Given that the components of the 
polarizability tensor transform as* $x_i^2$ ($i = 1,2,3$) and $x_i x_j$, which belong to the 
$\Gamma^1_+$ and the $\Gamma^5_+$ irreducible representations of $H_3$ (see Table 5), we have 

$$\Gamma^{\text{Raman}}_{540,\mathrm{MS2\,1}} = 6\Gamma^1_+ + 24\Gamma^5_+, \tag{4.8}$$

i.e. the Raman active fundamental levels are the six non-degenerate levels belonging 
to $\Gamma^1_+$ and the twenty-four five-fold degenerate levels belonging to $\Gamma^5_+$. The infrared 
active modes belong to Fj^ and T^, and hence are completely decoupled from the 
Raman active modes, as expected from a capsid with a centre of inversion. 
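
The counting in (4.5) to (4.8) can be reproduced numerically. The following is a minimal sketch rather than the authors' code. It assumes the standard character values of the icosahedral group on the identity and on a 2-fold rotation, and it uses the fact that each of the 15 reflections of type $g_0 g_2$ contributes the number of proteins lying on its mirror plane to the character of the displacement representation. All function and variable names are hypothetical.

```python
# Characters of the rotation-group irreducible representations on the
# identity e and on a 2-fold rotation g_2 (standard icosahedral values).
CHI_E = {"1": 1, "3": 3, "3'": 3, "4": 4, "5": 5}
CHI_G2 = {"1": 1, "3": -1, "3'": -1, "4": 0, "5": 1}
N_MIRRORS = 15   # number of elements of type g_0 g_2 in H_3
ORDER_H3 = 120   # order of the full icosahedral group


def displacement_multiplicities(n_proteins, n_on_mirror):
    """Multiplicities n_p of the displacement representation under H_3."""
    dof = 3 * n_proteins
    mult = {}
    for p in CHI_E:
        for sign, parity in (("+", 1), ("-", -1)):
            # Character on g_0 g_2: even irreps copy g_2, odd irreps flip sign.
            chi_mirror = parity * CHI_G2[p]
            # A reflection has trace +1 on 3-vectors, so each protein on the
            # mirror plane contributes +1 to the displacement character.
            total = dof * CHI_E[p] + N_MIRRORS * n_on_mirror * chi_mirror
            if total % ORDER_H3:
                raise ValueError("non-integer multiplicity")
            mult[p + sign] = total // ORDER_H3
    return mult


def vibrational_multiplicities(displ):
    """Remove rigid rotations (3+) and rigid translations (3-)."""
    vib = dict(displ)
    vib["3+"] -= 1
    vib["3-"] -= 1
    return vib


def raman_multiplicities(vib):
    """Keep the irreps carried by the polarizability tensor (1+ and 5+)."""
    return {p: vib[p] for p in ("1+", "5+")}


ms2_displ = displacement_multiplicities(n_proteins=180, n_on_mirror=12)
ms2_vib = vibrational_multiplicities(ms2_displ)
ms2_raman = raman_multiplicities(ms2_vib)
print(ms2_displ)
print(ms2_raman)
```

Calling `displacement_multiplicities(360, 24)` applies the same counting to a capsid with 360 proteins and 24 proteins per mirror plane.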



MS2 type 2: The analysis leads to the same decomposition pattern as for MS2 type 
1. According to the decorations in Fig. [T] (c), the number of proteins left unmoved by 
elements of $H_3$ of type $g_0 g_2$ for a given 2-fold rotation is again 12, although their 
distribution on the orthogonal plane differs from the previous case. This does not affect 
the qualitative analysis of Raman modes, but may be of importance when calculating 
the frequencies of the normal modes of vibration [11]. The decompositions (4.6) and (4.7) 
therefore carry over unchanged to MS2 type 2, and 

$$\Gamma^{\text{Raman}}_{540,\mathrm{MS2\,2}} = \Gamma^{\text{Raman}}_{540,\mathrm{MS2\,1}}. \tag{4.10}$$



TBSV: This viral capsid does not exhibit a centre of inversion, even approximate, and 
the relevant group for our analysis is the subgroup $\mathcal{I}$ of the full icosahedral group. The 
only element which leaves proteins unmoved is the identity, under which all 180 proteins 
are unmoved. The coefficients of the decomposition of the displacement representation 
in this case are 

$$n_p^{\mathrm{TBSV}} = \frac{1}{60}\left\{540\,\chi^p(e)\right\}, \qquad p = 1_+, 3_+, 3'_+, 4_+, 5_+. \tag{4.11}$$



We therefore have the decomposition† 

$$\Gamma^{\text{displ}}_{540,\mathrm{TBSV}} = 9\Gamma^1_+ + 27\Gamma^3_+ + 27\Gamma^{3'}_+ + 36\Gamma^4_+ + 45\Gamma^5_+, \tag{4.12}$$



* $x_i$, $i = 1, 2, 3$, are the coordinates of a point in 3-space. 

† This is a special case of the formula giving the decomposition of the $3N$-dimensional displacement 
representation of a viral capsid with $N$ capsid proteins into irreducible representations of the group of 
proper rotations $\mathcal{I}$, namely 

$$\Gamma^{\text{displ}}_{3N} = \frac{3N}{60}\left\{\Gamma^1_+ + 3\Gamma^3_+ + 3\Gamma^{3'}_+ + 4\Gamma^4_+ + 5\Gamma^5_+\right\}.$$
and once the rotations and translations are subtracted, we obtain 

$$\Gamma^{\text{vib}}_{540,\mathrm{TBSV}} = 9\Gamma^1_+ + 25\Gamma^3_+ + 27\Gamma^{3'}_+ + 36\Gamma^4_+ + 45\Gamma^5_+. \tag{4.13}$$

The Raman modes of vibration are 

$$\Gamma^{\text{Raman}}_{540,\mathrm{TBSV}} = 9\Gamma^1_+ + 45\Gamma^5_+, \tag{4.14}$$

i.e. the Raman active fundamental levels are the nine non-degenerate levels belonging to 
$\Gamma^1_+$ and the forty-five five-fold degenerate levels belonging to $\Gamma^5_+$. These modes are not infrared 
active. Note that the Raman signatures of Bacteriophage MS2 and Tomato Bushy Stunt 
Virus are different, owing to the existence of an approximate centre of inversion for MS2, 
but not for TBSV. 
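
As a consistency check, offered as a sketch rather than as part of the original derivation, the general rotation-group formula with $N = 180$ reproduces (4.12), and a dimension count confirms that exactly six rigid-body degrees of freedom have been removed. Both rigid rotations and rigid translations transform as $\Gamma^3_+$ under proper rotations alone, so only that multiplicity changes.

```latex
\begin{align*}
\Gamma^{\text{displ}}_{540,\mathrm{TBSV}}
  &= \frac{3\cdot 180}{60}\,\bigl\{\Gamma^1_+ + 3\Gamma^3_+ + 3\Gamma^{3'}_+ + 4\Gamma^4_+ + 5\Gamma^5_+\bigr\}
   = 9\Gamma^1_+ + 27\Gamma^3_+ + 27\Gamma^{3'}_+ + 36\Gamma^4_+ + 45\Gamma^5_+ ,\\
\dim \Gamma^{\text{vib}}_{540,\mathrm{TBSV}}
  &= 9\cdot 1 + 25\cdot 3 + 27\cdot 3 + 36\cdot 4 + 45\cdot 5
   = 534 = 540 - 6 .
\end{align*}
```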



SV40: We argued that this viral capsid has an approximate centre of inversion, and we 
thus perform our analysis with the full icosahedral group. Besides the identity, which 
leaves all 360 capsid proteins unmoved, we learn from the symmetry-corrected tiling in 
Fig. 10(c) that there are 24 capsid proteins lying on the plane orthogonal to a given 
2-fold symmetry axis, as can be observed in Fig. 13. 




Figure 13: The 24 SV40 capsid proteins located on the plane orthogonal to a 2-fold 
symmetry axis represented by the red dots, when considering the symmetry-corrected 
version pictured in Fig. 10(c). 



Hence the coefficients $n_p$ are given by 

$$n_p^{\mathrm{SV40}} = \frac{1}{120}\left\{1080\,\chi^p(e) + 360\,\chi^p(g_0 g_2)\right\}, \qquad p = 1_\pm, 3_\pm, 3'_\pm, 4_\pm, 5_\pm. \tag{4.15}$$

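Once again, using the information in the character table (Table 5), we obtain the following 
decomposition of the displacement representation, 

$$\Gamma^{\text{displ}}_{1080,\mathrm{SV40}} = 12\Gamma^1_+ + 24\Gamma^3_+ + 24\Gamma^{3'}_+ + 36\Gamma^4_+ + 48\Gamma^5_+ + 6\Gamma^1_- + 30\Gamma^3_- + 30\Gamma^{3'}_- + 36\Gamma^4_- + 42\Gamma^5_-. \tag{4.16}$$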

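The genuine modes of vibration are encoded in the following, 

$$\Gamma^{\text{vib}}_{1080,\mathrm{SV40}} = 12\Gamma^1_+ + 23\Gamma^3_+ + 24\Gamma^{3'}_+ + 36\Gamma^4_+ + 48\Gamma^5_+ + 6\Gamma^1_- + 29\Gamma^3_- + 30\Gamma^{3'}_- + 36\Gamma^4_- + 42\Gamma^5_-, \tag{4.17}$$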
while the Raman modes of vibration are 

$$\Gamma^{\text{Raman}}_{1080,\mathrm{SV40}} = 12\Gamma^1_+ + 48\Gamma^5_+, \tag{4.18}$$

i.e. the Raman active fundamental levels are the twelve non-degenerate levels belonging 
to $\Gamma^1_+$ and the forty-eight five-fold degenerate levels belonging to $\Gamma^5_+$. These Raman modes 
are not infrared active, as expected. 
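
To make the use of (4.15) concrete, here is a worked evaluation for four of the ten irreducible representations. It is a sketch that assumes the standard character values of the full icosahedral group on a reflection $g_0 g_2$, namely $+1$ for $1_+$, $5_+$ and $3_-$, and $-1$ for $5_-$.

```latex
\begin{align*}
n_{1_+}^{\mathrm{SV40}} &= \tfrac{1}{120}\,(1080\cdot 1 + 360\cdot 1) = 12, &
n_{5_+}^{\mathrm{SV40}} &= \tfrac{1}{120}\,(1080\cdot 5 + 360\cdot 1) = 48, \\
n_{3_-}^{\mathrm{SV40}} &= \tfrac{1}{120}\,(1080\cdot 3 + 360\cdot 1) = 30, &
n_{5_-}^{\mathrm{SV40}} &= \tfrac{1}{120}\,(1080\cdot 5 - 360\cdot 1) = 42.
\end{align*}
```

The coefficient 360 is the product of the 15 reflections and the 24 proteins lying on each mirror plane.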



HK97: This viral capsid does not exhibit a centre of inversion, even approximate, and 
the relevant group for our analysis is the subgroup $\mathcal{I}$ of the full icosahedral group. The 
only element which leaves proteins unmoved is the identity, under which all 420 proteins 
are unmoved. The coefficients of the decomposition of the displacement representation 
in this case are 

$$n_p^{\mathrm{HK97}} = \frac{1}{60}\left\{1260\,\chi^p(e)\right\}, \qquad p = 1_+, 3_+, 3'_+, 4_+, 5_+. \tag{4.19}$$

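We therefore have the decomposition 

$$\Gamma^{\text{displ}}_{1260,\mathrm{HK97}} = 21\Gamma^1_+ + 63\Gamma^3_+ + 63\Gamma^{3'}_+ + 84\Gamma^4_+ + 105\Gamma^5_+, \tag{4.20}$$

and once the rotations and translations are subtracted, we obtain 

$$\Gamma^{\text{vib}}_{1260,\mathrm{HK97}} = 21\Gamma^1_+ + 61\Gamma^3_+ + 63\Gamma^{3'}_+ + 84\Gamma^4_+ + 105\Gamma^5_+. \tag{4.21}$$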

The Raman modes of vibration are 

$$\Gamma^{\text{Raman}}_{1260,\mathrm{HK97}} = 21\Gamma^1_+ + 105\Gamma^5_+, \tag{4.22}$$

i.e. the Raman active fundamental levels are the twenty-one non-degenerate levels 
belonging to $\Gamma^1_+$ and the one hundred and five five-fold degenerate levels belonging to $\Gamma^5_+$. These 
modes are not infrared active. 
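
For capsids without a centre of inversion, the counting reduces to a single division, because only the identity leaves proteins unmoved. The sketch below is an illustration under that assumption, with hypothetical function names. It reproduces (4.20) to (4.22) and checks the dimension of the vibrational representation.

```python
# Dimensions of the five irreducible representations of the rotation group.
DIMENSIONS = {"1+": 1, "3+": 3, "3'+": 3, "4+": 4, "5+": 5}
GROUP_ORDER = 60


def displacement_multiplicities(n_proteins):
    """Multiplicities when only the identity leaves proteins unmoved."""
    dof = 3 * n_proteins
    if dof % GROUP_ORDER:
        raise ValueError("3N must be a multiple of 60 for this formula")
    return {p: (dof // GROUP_ORDER) * d for p, d in DIMENSIONS.items()}


def vibrational_multiplicities(displ):
    """Remove rigid rotations and translations, both of which sit in 3+."""
    vib = dict(displ)
    vib["3+"] -= 2
    return vib


def raman_multiplicities(vib):
    """Keep the irreps carried by the polarizability tensor (1+ and 5+)."""
    return {p: vib[p] for p in ("1+", "5+")}


hk97_displ = displacement_multiplicities(420)
hk97_vib = vibrational_multiplicities(hk97_displ)
hk97_raman = raman_multiplicities(hk97_vib)
hk97_vib_dim = sum(n * DIMENSIONS[p] for p, n in hk97_vib.items())
print(hk97_displ, hk97_vib, hk97_raman, hk97_vib_dim)
```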



5 Conclusion 

A full analysis of the vibrational modes of icosahedral viral capsids is a challenging task, 
as the number of degrees of freedom for a capsid consisting of n atoms is 3n. Even in the 
framework of classical mechanics, and within the approximation of a harmonic potential 
to describe interactions between atoms, a brute force calculation requires increasingly 
prohibitive CPU times as the capsid's size grows. There are a handful of impressive 
results in the literature where various levels of coarse-graining have been implemented, 
and where the use of icosahedral symmetry has helped reduce the computer time needed 
to obtain the frequencies of the lowest modes of vibration for several capsids up to 
triangulation number T = 7 [31 H]. Although these calculations are extremely valuable, 
they offer little insight into patterns of vibration across the spectrum of capsids. 

Our motivation has been to step back from these molecular dynamics calculations, and 
re-examine whether the underlying group theory, combined with Viral Tiling Theory, could 
provide new insights into the vibration patterns of icosahedral capsids. The first step 
in our approximation procedure consisted in replacing each capsid protein by a 
point mass located at the centre of mass of the protein in question, calculated from all 
crystallographically identified constituent atoms. This level of coarse-graining may 
appear very crude, but we have shown here that it highlights features of the effective 
distribution of proteins on the capsid that are buried in the existing analyses. These 
features emerged when we analyzed the deviation between the distribution of point masses 
described above and the distribution obtained by taking the (additive) inverse of all 
point masses. Although, strictly speaking, none of the known icosahedral viral capsids 
enjoys a centre of inversion in vivo, some are remarkably close to possessing one, as can 
be quantified from the measured deviations. We found that the largest deviation between 
the actual location of a centre of mass and its would-be location, were there an exact 
centre of inversion for Bacteriophage MS2 (T = 3), occurs in chain A and is 5.77 Å for a 
capsid determined at a resolution of 2.8 Å. Similarly, the largest deviation for Simian 
Virus 40 (pseudo T = 7d) occurs for chain F at 7.04 Å for a capsid determined at a 
resolution of 3.1 Å. These deviations are substantially smaller than the corresponding 
ones in chain B of Tomato Bushy Stunt Virus (T = 3), where the deviation is 10.03 Å at a 
resolution of 2.9 Å, and in chain D of Bacteriophage Hong Kong 97, where it is 17.84 Å. 
In the light of these data, we have considered that MS2 and SV40 have effective centres 
of inversion, as can be anticipated from their ideal tiling representations in Fig. 2 and 
Fig. 4. Indeed, for MS2, the ideal tiling has a centre of inversion, while that of SV40, 
although not possessing one, has decorations which are highly compatible with invariance 
under inversion, as the six chains in the kite-shaped fundamental domain (of the proper 
rotation subgroup $\mathcal{I}$) are nearly on the edges of the half-kite-shaped fundamental 
domain of the full icosahedral group $H_3$ represented in Fig. 10(c). We therefore conclude 
that the underlying tiling and its decorations can help identify those capsids with an 
effective centre of inversion, and hence provide a way to qualitatively differentiate 
between vibrational patterns, as the underlying group theory yields subtle variations, 
particularly in the number of normal modes which are Raman active. 
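
One way to quantify how close a set of point masses is to having a centre of inversion is sketched below. It is an illustration only: pairing each inverted point with its nearest neighbour in the original set is an assumption, the text does not specify how a centre of mass is matched to its would-be location, and the function name and the toy coordinates are hypothetical.

```python
import math


def inversion_deviation(points):
    """Largest distance between an inverted point -p and its nearest
    neighbour in the original set (0 for an exactly symmetric set)."""
    worst = 0.0
    for p in points:
        image = tuple(-c for c in p)  # would-be location under inversion
        nearest = min(math.dist(image, q) for q in points)
        worst = max(worst, nearest)
    return worst


# Toy data, not capsid coordinates: an exactly symmetric pair, and a pair
# in which one point is displaced by 0.5 along the z axis.
symmetric = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
perturbed = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.5)]
dev_symmetric = inversion_deviation(symmetric)
dev_perturbed = inversion_deviation(perturbed)
print(dev_symmetric, dev_perturbed)
```

Applied to the centres of mass of all capsid proteins, in the same length units as the structure file, the returned value would play the role of the largest deviations quoted above, provided the nearest-neighbour pairing matches the intended one.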

Our analysis clarifies that the property for a viral capsid of having a centre of inversion 
is neither correlated with its size, nor with a type of tiling. For instance, MS2 and HK97 
are both modelled by a rhomb tiling, but the former has a centre of inversion while the 
latter does not. The important factor is how the capsid proteins' centres of mass are 
distributed on the tiles within the fundamental domain of the icosahedral group. As for 
the level of coarse-graining adopted here, we believe it provides a reasonable estimate of 
the group theoretical properties of the lowest modes of vibrations, and will yield useful 
information on the actual frequencies of these modes [11]. 

The methods developed here can easily be used for the analysis of any known capsid, 
irrespective of its size, and should provide valuable information on its lowest frequency 
modes of vibration. 
Appendix 



Consider Fig. [ll[a) and (b). The 12 vertices of the icosahedron can be thought of as the 
vertices of three golden rectangles which are perpendicular to each other. If we choose a 
right handed Cartesian reference frame in Fig. [lT|(b), and scale the icosahedron in such 
a way that its edges have length 2, the three golden rectangles have vertices [ABCD], 
[EFGH] and [IJKL] with the coordinates being 



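$$\begin{array}{lll}
A = (0,-1,\tau), & E = (1,\tau,0), & I = (\tau,0,1), \\
B = (0,1,\tau), & F = (-1,\tau,0), & J = (\tau,0,-1), \\
C = (0,-1,-\tau), & G = (-1,-\tau,0), & K = (-\tau,0,1), \\
D = (0,1,-\tau), & H = (1,-\tau,0), & L = (-\tau,0,-1),
\end{array}$$

where $\tau = \frac{1}{2}(1 + \sqrt{5})$. 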

We choose the elements $g_2$ and $g_3$ as follows (see also, for instance, 

1. $g_2$ is the 2-fold rotation taken clockwise about the axis through the origin and the 
midpoint of the segment [CH], given by $\frac{1}{2}(1, -(\tau + 1), -\tau)$. The unit vector along 
this axis is $\frac{1}{2}(-\tau', -\tau, -1)$, with $\tau' = \frac{1}{2}(1 - \sqrt{5})$. 

2. $g_3$ is the 3-fold rotation taken clockwise about the axis through the origin and the 
centroid of the triangle [CHJ], given by $\frac{1}{3}(\tau + 1, -(\tau + 1), -(\tau + 1))$. The unit 
vector along this axis is $\frac{1}{\sqrt{3}}(1, -1, -1)$. 
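
The coordinates and axes above can be checked with a few lines of arithmetic. The sketch below is a verification aid with hypothetical helper names, not part of the original construction. It confirms that the twelve vertices form 30 edges of length 2 and recomputes the midpoint of [CH], the centroid of [CHJ] and the two unit vectors.

```python
import math
from itertools import combinations

TAU = (1 + math.sqrt(5)) / 2  # golden ratio

VERTICES = {
    "A": (0, -1, TAU), "B": (0, 1, TAU), "C": (0, -1, -TAU), "D": (0, 1, -TAU),
    "E": (1, TAU, 0), "F": (-1, TAU, 0), "G": (-1, -TAU, 0), "H": (1, -TAU, 0),
    "I": (TAU, 0, 1), "J": (TAU, 0, -1), "K": (-TAU, 0, 1), "L": (-TAU, 0, -1),
}


def mean(*points):
    """Componentwise average of the given points."""
    return tuple(sum(cs) / len(points) for cs in zip(*points))


def unit(v):
    """Vector of length 1 along v."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


# Pairs of vertices at distance 2 are the edges of the icosahedron.
edges = [(a, b) for a, b in combinations(VERTICES, 2)
         if math.isclose(math.dist(VERTICES[a], VERTICES[b]), 2.0)]

midpoint_CH = mean(VERTICES["C"], VERTICES["H"])
centroid_CHJ = mean(VERTICES["C"], VERTICES["H"], VERTICES["J"])
axis_g2 = unit(midpoint_CH)    # direction of the 2-fold axis
axis_g3 = unit(centroid_CHJ)   # direction of the 3-fold axis
print(len(edges), axis_g2, axis_g3)
```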

The full icosahedral group $H_3$ has 120 elements and is generated by the elements $g_2$, $g_3$ 
and $g_0$. The first two are well-chosen 2-fold and 3-fold rotations satisfying $g_2^2 = g_3^3 = 1$ 
and $(g_2 g_3)^5 = 1$. They generate the group of proper rotations $\mathcal{I}$ of order 60. 

To obtain $H_3$, one adds the reflection through the origin (or inversion) $g_0$ such that 
$g_0^2 = 1$. 

The elements of $H_3$ are organized in ten conjugacy classes, and hence there exist ten 
irreducible representations whose characters are given in Table 5. 
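
The group-theoretic statements of this appendix can be verified by brute force. The sketch below is a numerical check with hypothetical helper names. It builds $g_2$ and $g_3$ as rotation matrices about the two unit vectors given above, tests the defining relations, and generates the groups by closure. The sense of the 3-fold rotation is an assumption (a rotation by $-2\pi/3$ about the stated unit vector), but the relations and the group orders do not depend on it.

```python
import math

TAU = (1 + math.sqrt(5)) / 2
IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


def rotation(axis, angle):
    """Rodrigues matrix for a rotation by `angle` about a unit `axis`."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    t = 1 - c
    return ((t * x * x + c, t * x * y - s * z, t * x * z + s * y),
            (t * x * y + s * z, t * y * y + c, t * y * z - s * x),
            (t * x * z - s * y, t * y * z + s * x, t * z * z + c))


def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))


def transpose(a):
    return tuple(zip(*a))


def key(m):
    """Hashable fingerprint that merges floating-point duplicates."""
    return tuple(round(v, 6) for row in m for v in row)


def power(m, n):
    result = IDENTITY
    for _ in range(n):
        result = matmul(result, m)
    return result


def generate(generators):
    """All products of the generators, found by repeated multiplication."""
    elements = {key(IDENTITY): IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        new = []
        for m in frontier:
            for g in generators:
                product = matmul(g, m)
                if key(product) not in elements:
                    elements[key(product)] = product
                    new.append(product)
        frontier = new
    return list(elements.values())


def count_conjugacy_classes(elements):
    seen, classes = set(), 0
    for g in elements:
        if key(g) in seen:
            continue
        classes += 1
        for h in elements:  # the inverse of a rotation or inversion is its transpose
            seen.add(key(matmul(matmul(h, g), transpose(h))))
    return classes


s3 = 1 / math.sqrt(3)
g2 = rotation(((TAU - 1) / 2, -TAU / 2, -0.5), math.pi)
g3 = rotation((s3, -s3, -s3), -2 * math.pi / 3)
g0 = ((-1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, -1.0))

rotations = generate([g2, g3])
full_group = generate([g2, g3, g0])
print(len(rotations), len(full_group), count_conjugacy_classes(full_group))
```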